Abstract- Hyperspectral imaging has been widely used in remote sensing
applications to describe the composition of a scene across hundreds of
spectral bands. Hyperspectral images (HSI) require an accurate training
model for extracting the characteristics of the scenes they present.
Learning models that exploit spectral resolution face major challenges
because of the complex nature of the image data. Several attempts have
been made to address this complexity. Nevertheless, these models have
failed to capture a deeper understanding of hyperspectral images. In the
proposed method, the spectral values of every pixel of a hyperspectral
image are fed one by one into a spectral long short-term memory (LSTM)
network to learn the spectral features. Most of the existing
state-of-the-art models are based on spectral-spatial frameworks, in
which the added spatial features add more dimensions to hyperspectral
images. However, these classification models do not take advantage of
the sequential nature of these images.

Because of mixed pixels, limited training samples, and redundant data,
deep learning techniques are well suited to these problems. This paper
describes a method for the classification of hyperspectral images
through spectral-spatial LSTM networks. In the proposed spectral and
spatial joint feature network (SSJFN), principal component analysis
(PCA) is used to extract the first principal component of the image, and
the spectral and spatial features are extracted individually via LSTM to
obtain a unified end-to-end network. Furthermore, all processes are
integrated in one neural network by attaching a classifier, so that the
training error can be backpropagated, which may lead to learning more
features. During classification, SoftMax classifiers consider the
spatial and spectral characteristics of every pixel independently to
produce two different outcomes. Afterwards, the joint spectral-spatial
result is obtained through a decision fusion strategy. The
classification accuracy improves by 2.69%, 1.53%, and 1.08% compared
with the other state-of-the-art methods.
I. Introduction


A hyperspectral image is a three-dimensional data cube. It consists of
one-dimensional spectral data about the spectral bands and
two-dimensional spatial data about image features. In particular, each
spectral band covers a very narrow wavelength range. At the same time,
image features, such as land-cover and shape features, show
inconsistencies and relationships between adjoining pixels in different
directions at the same wavelength.


Obtaining images with both high spatial and high spectral resolution has
become easier and more useful since the development of hyperspectral
sensors. From monitoring the surface of the Earth to applications in
agriculture, chemical imaging, environmental sciences, and
physics-related fields, hyperspectral data has become a significant
tool. These applications require identifying the label of every pixel
through image classification [1, 2]. Several strategies for
hyperspectral image (HSI) classification have been proposed.
Conventional approaches such as k-nearest neighbors (KNN) rely directly
on the spectral characteristics, which results in the "curse of
dimensionality" [3].


Resolving this issue requires methods for dimensionality reduction, such
as principal component analysis (PCA) [4, 5] and linear discriminant
analysis (LDA) [6, 7]. According to [8], the support vector machine
(SVM) method has been used for HSI classification because it has low
sensitivity to high input dimensionality and small sample size. Compared
with other methods, SVM-based classifiers usually perform more
effectively [9]. Still, SVM remains a shallow architecture. Shallow
architectures are useful for a variety of simple or constrained
problems. In complex cases, however, their limited modeling and
representational capacity is inadequate, as described in [10]. In recent
years, deep learning techniques have achieved great success in a range
of machine learning tasks [3], thanks to advances in computing power and
the availability of large-scale datasets. Convolutional neural networks
(CNNs), owing to their local connectivity and weight-sharing properties
[10, 11], are acknowledged as a state-of-the-art feature extraction
method for a variety of tasks, including computer vision [2]. Moreover,
recurrent neural networks (RNNs) [12] and their variants have been
extensively used to model sequential data in applications such as speech
recognition and translation [1, 13].


Deep learning has gained importance in recent years in the remote
sensing community, particularly in the classification of HSI [14, 15].
For instance, a stacked autoencoder model was put forward for the
unsupervised extraction of high-level features [1]. Tao et al. offered
an enhanced version of the autoencoder model [16], which added a
regularization term to the energy function.


As stated by [5], a deep belief network (DBN) was used for the
extraction of features, followed by classification with a logistic
regression classifier. The inputs of these models were high-dimensional
vectors. Consequently, one way to learn the spatial feature from an HSI
is to flatten a local image patch into a vector. However, this technique
might lose spatial information because it destroys the two-dimensional
structure of the image. The study [17] presents a two-dimensional CNN
model to resolve this concern, although it may lose spectral information
because it uses only the first principal component of the HSI as input.
The three-dimensional CNN [18] uses local cubes as inputs to learn
spectral and spatial features simultaneously.

Because hyperspectral data is densely sampled from the entire spectrum,
its spectral bands are expected to be correlated. Firstly, adjacent
spectral bands of any material are likely to have very similar values,
implying that neighboring spectral bands largely depend on one another.
Additionally, some materials exhibit long-range dependencies across
spectral bands [13]. In this study, every hyperspectral pixel is treated
as a sequence of data, and the dependency in the spectral domain is
modeled using long short-term memory (LSTM) [19]. Like spectral
channels, image pixels in the spatial domain are interdependent. Hence,
LSTM may be used to extract spatial information as well. The spectral
and spatial features extracted for every pixel are fed into SoftMax
classifiers, and fusing the two classification results yields the joint
spectral-spatial result.


The rest of the article is organized as follows. The fundamentals of
LSTM are reviewed next. Section II covers the proposed method in depth.
Section III discusses the experiments and Section IV offers the
conclusion.


Long Short-Term Memory (LSTM): Recurrent neural networks handle sequence
learning problems by adding recurrent edges that connect a neuron to
itself across time steps [20]. Consider an input sequence
$\{x_1, x_2, \ldots, x_T\}$ whose hidden-layer states are
$\{h_1, h_2, \ldots, h_T\}$. At time $t$, a node with a recurrent edge
receives the input $x_t$ along with its previous output $h_{t-1}$ from
time $t - 1$. Its output is the weighted sum

$$h_t = \sigma(W_{hx} x_t + W_{hh} h_{t-1} + b)$$

• $W_{hx}$ = weight between the input node and the recurrent hidden node
• $W_{hh}$ = weight between the recurrent hidden node and itself from
  the previous time step
• $b$ = bias
• $\sigma$ = nonlinear activation function

However, training RNN models exposes a problem. In the equation above,
the contribution of the recurrent hidden node $h_m$ at time $m$ to
itself, $h_n$, at time $n$ may approach zero or infinity as $n - m$
increases, depending on whether $|W_{hh}| < 1$ or $|W_{hh}| > 1$.
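The effect can be illustrated with a deliberately simplified scalar
recurrence. The sketch below assumes a linear activation and no further
input after time $m$, so that the contribution of $h_m$ to $h_n$ reduces
to $W_{hh}^{n-m}$; the function name `recurrent_contribution` is
hypothetical and not part of the original work.

```python
def recurrent_contribution(w_hh: float, steps: int) -> float:
    """Factor by which h_m influences h_n after steps = n - m time steps.

    Assumes the scalar linear recurrence h_t = w_hh * h_{t-1}
    (identity activation, zero input), an illustrative simplification.
    """
    contribution = 1.0
    for _ in range(steps):
        contribution *= w_hh  # one pass through the recurrent edge
    return contribution


# |w_hh| < 1 shrinks the contribution; |w_hh| > 1 makes it grow.
for w_hh in (0.9, 1.1):
    factors = [recurrent_contribution(w_hh, k) for k in (1, 10, 50)]
    print(w_hh, factors)
```

With a weight slightly below one the factor decays toward zero, and with
a weight slightly above one it grows without bound, which is the
behavior that motivates the memory cell described next.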


During error backpropagation, the gradient therefore either explodes or
vanishes, which makes it difficult for an RNN to handle long-term
dependencies. To solve these problems, LSTM replaces the recurrent
hidden node with a memory cell. The memory cell is shown in Figure 1, in
which $\odot$ denotes the element-wise (dot) product and $\oplus$
denotes matrix addition [21]. The memory cell contains a node with a
self-connected recurrent edge of fixed weight. This ensures that the
gradient can pass across many time steps without vanishing or exploding.
An LSTM unit consists of an input gate, a forget gate, an output gate,
and a candidate cell value. The memory cell and the output are given
below.

$$f_t = \sigma(W_{hf} h_{t-1} + W_{xf} x_t + b_f)$$

$$i_t = \sigma(W_{hi} h_{t-1} + W_{xi} x_t + b_i)$$

$$\tilde{C}_t = \tanh(W_{hC} h_{t-1} + W_{xC} x_t + b_C)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

$$o_t = \sigma(W_{ho} h_{t-1} + W_{xo} x_t + b_o)$$

$$h_t = o_t \odot \tanh(C_t)$$

Fig. 1. Memory cell
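A single LSTM time step with forget, input, and output gates and a
candidate cell value, as listed above, can be sketched in NumPy as
follows. This is a minimal illustration under stated assumptions, not
the authors' implementation: the helper names (`lstm_step`,
`init_params`, `run_lstm`), the parameter layout, and the small random
initialization are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    params maps each of 'f', 'i', 'c', 'o' to a tuple (W_h, W_x, b).
    """
    def affine(name):
        w_h, w_x, b = params[name]
        return w_h @ h_prev + w_x @ x_t + b

    f_t = sigmoid(affine("f"))           # forget gate
    i_t = sigmoid(affine("i"))           # input gate
    c_tilde = np.tanh(affine("c"))       # candidate cell value
    c_t = f_t * c_prev + i_t * c_tilde   # element-wise memory cell update
    o_t = sigmoid(affine("o"))           # output gate
    h_t = o_t * np.tanh(c_t)             # output of the unit
    return h_t, c_t


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    return {
        name: (
            0.1 * rng.standard_normal((hidden_size, hidden_size)),
            0.1 * rng.standard_normal((hidden_size, input_size)),
            np.zeros(hidden_size),
        )
        for name in ("f", "i", "c", "o")
    }


def run_lstm(sequence, params, hidden_size):
    """Feed a (T x input_size) sequence step by step; return the final output."""
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
    return h
```

For example, `run_lstm(np.linspace(0, 1, 10).reshape(10, 1),
init_params(1, 4), 4)` returns a length-4 vector that could be passed to
a classifier layer.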
II. Methodology


Figure 2 depicts the flowchart of the proposed spectral and spatial
joint feature network (SSJFN). The diagram shows that SSJFN has two main
components: Spectral-LSTM and Spatial-LSTM. The spectral values of each
pixel in a given HSI were fed to the Spectral-LSTM to learn the spectral
feature and subsequently obtain a classification result. Likewise, every
pixel's local patch was fed into the Spatial-LSTM to extract the spatial
feature and obtain a classification result. Finally, the two
classification results were merged by a weighted sum to obtain the joint
spectral-spatial result. Each of these steps is discussed in depth in
the subsections that follow.


A. Spatial-LSTM 


The spatial feature of a pixel was extracted from its neighborhood
region. Because an HSI has hundreds of channels along the spectral
dimension, such a region always has thousands of dimensions. A large
neighborhood would therefore give the classifier an excessively large
and redundant input [11]. PCA was used first to extract the first
principal component, as inspired by the work in [1, 17]. Then, a
neighborhood $X_i \in \mathbb{R}^{S \times S}$ centered on a given pixel
$x_i$ was selected. The rows of this neighborhood were transformed into
an $S$-length sequence $\{X_i^1, \ldots, X_i^l, \ldots, X_i^S\}$,
wherein $X_i^l$ denotes the $l$-th row of $X_i$. Lastly, the spatial
feature of $X_i$ was extracted from the sequence using LSTM. As in the
spectral features-based classification, the final output of LSTM was
used as the input to a SoftMax layer to obtain the probability values
$P_{spa}(y = j \mid X_i)$, $j \in \{1, 2, \ldots, C\}$. The
configurations of the loss function and the optimization algorithm in
Spatial-LSTM are identical to those in Spectral-LSTM. Figure 3 depicts
the flowchart of the proposed spatial features-based classification.
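The construction of the spatial input sequence can be sketched as
follows. This is only an illustration under assumptions that the text
does not specify: PCA is computed with an SVD of the mean-centered
pixels, image borders are handled with reflect padding, and the function
names `first_principal_component` and `patch_row_sequence` are
hypothetical.

```python
import numpy as np


def first_principal_component(cube):
    """Project an HSI cube (H x W x K) onto its first principal component."""
    h, w, k = cube.shape
    pixels = cube.reshape(-1, k).astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]).reshape(h, w)


def patch_row_sequence(image, row, col, size):
    """Return the size x size neighborhood around (row, col) as a list of rows.

    Borders are handled with reflect padding (an assumption; the text does
    not say how edge pixels are treated). For even sizes the patch extends
    one pixel further up/left than down/right.
    """
    half = size // 2
    padded = np.pad(image, half, mode="reflect")
    # (row, col) in the original image is (row + half, col + half) in padded,
    # so the patch starting at original row - half starts at padded row.
    patch = padded[row : row + size, col : col + size]
    return [patch[l] for l in range(size)]
```

Each element of the returned list is one row $X_i^l$ of the
neighborhood, so the list can be fed to an LSTM one row per time step.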


B. Spectral-LSTM 


An HSI contains hundreds of spectral bands that capture various spectral
features of the objects in the same region. Spectra exhibit multiple
variations because of complex lighting conditions, sensor rotations, and
varied atmospheric scattering conditions. As a result, robust and
invariant features must be extracted for classification. Deep
architectures can potentially learn increasingly abstract features at
higher layers, and more abstract features are generally invariant to
most input variations. The current research employed LSTM to extract
spectral features for HSI classification, using the spectral values of
the distinct channels as a sequence of inputs. An overview of the
proposed spectral features-based classification technique is shown in
Figure 2. To begin with, the authors selected the pixel vector
$x_i \in \mathbb{R}^{1 \times K}$, where $K$ denotes the number of
spectral bands in a specific HSI. Next, the vector was converted to a
$K$-length sequence $\{x_i^1, \ldots, x_i^k, \ldots, x_i^K\}$, in which
$x_i^k \in \mathbb{R}^{1 \times 1}$ represents the pixel value of the
$k$-th spectral band. The elements of this sequence were then fed one by
one into LSTM, with the final output being fed to the SoftMax
classifier. Cross entropy, $CE = -\sum Y \log \hat{Y}$, was used as the
loss function, with $Y$ and $\hat{Y}$ representing the actual and
estimated labels of a pixel, respectively. The Adam algorithm [22] was
used to minimize this loss function. Lastly, the authors calculated
$P_{spe}(y = j \mid x_i)$, $j \in \{1, 2, \ldots, C\}$, where $C$
denotes the number of classes.
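The classifier side of this pipeline can be sketched with a numerically
stable softmax and the cross-entropy loss. The sketch assumes a one-hot
label vector and adds a small constant inside the logarithm for
stability; the function names, including `pixel_to_sequence`, are
hypothetical, and the Adam update itself is not shown.

```python
import numpy as np


def pixel_to_sequence(pixel):
    """Turn a K-band pixel vector into a K-length sequence of 1-element inputs."""
    return np.asarray(pixel, dtype=float).reshape(-1, 1)


def softmax(logits):
    """Softmax over the last axis, shifted by the maximum for stability."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def cross_entropy(y_true, y_prob, eps=1e-12):
    """CE = -sum(Y * log(Y_hat)) for a one-hot label Y and probabilities Y_hat.

    eps guards against log(0); it is an implementation detail assumed here.
    """
    return float(-np.sum(y_true * np.log(y_prob + eps)))
```

For a pixel whose true class is the third of three classes,
`cross_entropy(np.array([0.0, 0.0, 1.0]), softmax(np.array([1.0, 2.0, 3.0])))`
equals the negative log-probability assigned to that class.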


Fig. 2. Flow chart for the proposed SSJFN (stages: feature extraction, classification)
C. Joint Spatial-Spectral Classification


The spectral and spatial features-based classification techniques were
introduced in the two subsections above. Thanks to the development of
imaging spectroscopy technologies, current sensors can acquire HSI with
very high spatial resolution [23]. As a result, the pixels in a small
spatial neighborhood are likely to belong to the same class. The pixels
in a large homogeneous region, however, may have different spectral
responses, so they would be grouped into distinct subregions if only the
spectral characteristics were used. On the other hand, if only spatial
information were used, several neighboring regions would all be assigned
to the same class. Hence, accurate classification needs simultaneous
consideration of spatial and spectral information [7].

Using the posterior probabilities $P_{spe}(y = j \mid x_i)$ and
$P_{spa}(y = j \mid X_i)$, an obvious way of merging the spatial and
spectral features is to combine the two outcomes as a weighted sum,
which may be expressed as

$$P(y = j \mid x_i) = w_{spe} P_{spe}(y = j \mid x_i) + w_{spa} P_{spa}(y = j \mid X_i),$$

where $w_{spe}$ and $w_{spa}$ are fusion weights that satisfy
$w_{spe} + w_{spa} = 1$. For convenience, the current research used
constant weights, namely $w_{spe} = w_{spa} = \frac{1}{2}$.
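The weighted-sum decision fusion described above can be sketched in a
few lines. The function name `fuse_posteriors` is hypothetical, and
class indices here are zero-based, whereas the text numbers classes from
1 to C.

```python
import numpy as np


def fuse_posteriors(p_spe, p_spa, w_spe=0.5, w_spa=0.5):
    """Fuse spectral and spatial class posteriors by a weighted sum.

    Returns the fused probabilities and the zero-based index of the
    most probable class. The weights must sum to one.
    """
    if abs(w_spe + w_spa - 1.0) > 1e-9:
        raise ValueError("fusion weights must sum to 1")
    fused = w_spe * np.asarray(p_spe, dtype=float) + w_spa * np.asarray(p_spa, dtype=float)
    return fused, int(np.argmax(fused, axis=-1))


# The spectral branch favors class 0 and the spatial branch favors class 1;
# with equal weights the fused decision is class 1.
fused, label = fuse_posteriors([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
```

With equal weights the fused posterior is simply the average of the two
SoftMax outputs, so a confident branch can override an uncertain one.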


Fig. 3. IP Dataset: False-color composite with ground-truth image 
Table I
Number of Pixels for Training/Testing and the Total Number of Pixels
for Each Class in the IP Ground Truth Map


Class  Class name                    Training  Test
1      Alfalfa                       30        16
2      Corn-notill                   150       1278
3      Corn-mintill                  150       680
4      Corn                          100       137
5      Grass-pasture                 150       333
6      Grass-tree                    150       580
7      Grass-pasture-mowed           20        8
8      Hay-windrowed                 150       328
9      Oats                          15        5
10     Soybean-notill                150       822
11     Soybean-mintill               150       2305
12     Soybean-clean                 150       443
13     Wheat                         150       55
14     Woods                         150       1115
15     Buildings-Grass-Trees-Drives  50        336
16     Stone-Steel-Towers            50        43
Total                                1765      8484




Fig. 4. PU dataset: False-colour Composite with Ground-truth Image 


UMT— 63 


Department of Information Systems
Volume 2 Issue 1, Spring 2022


Deep Feature Learning... 


Table II
Number of Pixels for Training/Testing and the Total Number of Pixels
for Each Class in the PUS Ground Truth Map


Class  Class name    Training  Test
1      Asphalt       548       6083
2      Meadows       540       18109
3      Gravel        392       1707
4      Trees         542       2522
5      Metal sheets  256       1089
6      Bare soil     532       4497
7      Bitumen       375       955
8      Bricks        514       3168
9      Shadows       231       716
Total                3930      38846


III. Experimental Results 
A. Datasets 


We tested the proposed method on three well-known HSI datasets commonly
used to assess classification methods.


Indian Pines (IP): The first dataset, which covers 224 spectral bands,
was captured by the AVIRIS sensor on June 12, 1992 over the Indian Pines
test site in northwestern Indiana, USA. Two hundred bands were used
after removing four zero-valued bands and twenty bands affected by water
absorption. The image has a spatial resolution of 20 m and a spatial
size of 145 x 145 pixels. Figure 3 shows the false-color composite image
and the ground-truth map. Table I shows the number of samples provided,
which ranges from 20 to 2455 per class.


Pavia University (PUS): The ROSIS sensor collected the second dataset on
July 8, 2002 during a flight campaign over Pavia, northern Italy. The
original image has 115 spectral channels with wavelengths in the range
0.43-0.86 μm. Following the removal of noisy bands, 103 bands were used.
The image has a spatial resolution of 1.3 m and a size of 610 x 340
pixels. A three-band false-color composite image, as well as the
ground-truth map, is shown in Figure 4. The ground-truth map contains
nine land-cover classes, each with roughly 1000 or more labeled pixels,
as displayed in Table II.


B. Experimental Setups 


The performance of Spectral-LSTM, Spatial-LSTM, and SSJFN was evaluated,
both quantitatively and qualitatively, to illustrate the usefulness of
the proposed LSTM-based classification approach. They were also compared
against several state-of-the-art approaches, including PCA, LDA, CNN,
ELN-RegMLR, RNN-LSTM, and RNN-GRU-PRetanh [10, 16]. The original pixels
were also used directly as a benchmark. To solve the singularity problem
in LDA, the within-class scatter matrix $S_w$ is replaced with
$S_w + \epsilon I$, where $\epsilon = 10^{-3}$. The approach of [2, 24]
is used to select the best reduced dimensions for PCA, LDA, NWFE, and
RLDE. The best window size for MDA is chosen from the set
{3, 5, 7, 9, 11} [25]. For CNN, the number of layers and the filter
sizes are the same as in the network of [26]. Only one hidden layer was
used in LSTM, and the optimal number of hidden nodes was chosen from the
set {16, 32, 64, 128, 256}.


For the IP dataset, 10% of the pixels per class were chosen at random as
the training set, and the remaining pixels were used as the testing set.
For the PUS dataset, the authors chose 3921 pixels at random as the
training set, and the remaining pixels were used as the testing set
[27]. Tables I and II show the precise numbers of training and testing
samples. Every algorithm was run five times to limit the impact of
random selection, and the average results are reported. Overall accuracy
(OA), average accuracy (AA), per-class accuracy, and the Kappa
coefficient were used to assess classification performance. OA is the
ratio of correctly classified pixels to the total number of pixels in
the testing set, and AA is the average of the accuracies of all the
classes. The Kappa coefficient is the percentage of agreement adjusted
by the number of agreements that would be expected simply by chance.

C. Parameter Selections 


The size of the neighborhood region and the number of hidden nodes are
two crucial parameters in the proposed classification system. To begin
with, the numbers of hidden nodes were fixed and the best region size
was chosen from the set {8 x 8, 16 x 16, 32 x 32, 64 x 64}. Table IV
shows the OAs of the SSJFN technique. For the PUS dataset, OA first
increases and then decreases as the region size grows, which makes
32 x 32 the best size. For the IP and KSC datasets, OA grows as the
region size becomes larger. Larger sizes, on the other hand,
dramatically increase processing time. Thus, for both the IP and KSC
datasets, the optimal size was determined to be 64 x 64.


Secondly, the region size was fixed and four alternative combinations,
{16, 32}, {32, 64}, {64, 128}, and {128, 256}, were tried to find the
optimal numbers of hidden nodes for Spectral-LSTM and Spatial-LSTM. As
shown in Table V, the SSJFN approach obtains the maximum OA on the IP
dataset when the numbers of hidden nodes for Spectral-LSTM and
Spatial-LSTM are set to 64 and 128, respectively. When the numbers of
hidden nodes for Spectral-LSTM and Spatial-LSTM are set to 128 and 256,
respectively, SSJFN achieves the highest OA on the PUS dataset.
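The two-stage selection procedure can be sketched as a small search
routine. The function name `two_stage_search` and the caller-supplied
`evaluate` callback (which would train a model and return its OA) are
hypothetical; the sketch only mirrors the order of the search and does
not account for the processing-time trade-off mentioned above.

```python
def two_stage_search(evaluate, region_sizes, node_pairs, initial_nodes):
    """Select the region size first, then the hidden-node combination.

    evaluate(region_size, nodes) must return the overall accuracy for a
    region size and a (spectral_nodes, spatial_nodes) pair.
    """
    # Stage 1: hidden nodes fixed, choose the best region size.
    best_size = max(region_sizes, key=lambda size: evaluate(size, initial_nodes))
    # Stage 2: region size fixed, choose the best hidden-node pair.
    best_nodes = max(node_pairs, key=lambda nodes: evaluate(best_size, nodes))
    return best_size, best_nodes
```

A call such as `two_stage_search(evaluate, [8, 16, 32, 64], [(16, 32),
(32, 64), (64, 128), (128, 256)], (32, 64))` needs only eight
evaluations instead of the sixteen required by a full grid.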


D. Performance Comparison 


Table III shows the OA and AA for multiple classification methods
applied to the Indian Pines dataset.


OA and AA are the lowest for PCA among all the compared methods. This is
because PCA does not take spatial features into account when classifying
spectral features. LDA shows better accuracy because it uses labeled
data for training. Furthermore, MDA and RLDE are comparatively better
than the LDA-based methods because of their ability to use spectral and
spatial features simultaneously. For the same reason, CNN outperforms
all of them, as it uses both spatial and spectral features for making
its predictions. It is noteworthy that ELN-RegMLR showed the second-best
results, with an overall accuracy of 97.93%, followed by RNN-GRU-PRetanh
and RNN-LSTM. This makes it evident that spatial features are equally
important when classifying spectral objects. The proposed algorithm,
SSJFN, outperforms all its predecessors, including regularized machine
learning algorithms such as ELN-RegMLR and deep neural networks such as
RNN-LSTM. The reason is that it incorporates all the features and can
also capture the non-linear distribution of hyperspectral data.
Table III
Classification Accuracy (%) for the Indian Pines Image Using Training and
Testing Samples


Class  Original  PCA    LDA    CNN    ELN-RegMLR  RNN-LSTM  RNN-GRU-PRetanh  SSJFN
1      56.96     59.57  63.04  73.17  97.92       46.03     70.59            99.87
2      79.75     68.75  72.04  93.48  96.67       61.73     70.28            98.16
3      66.60     53.95  57.54  84.02  97.07       86.96     81.52            99.24
4      59.24     55.19  46.58  83.57  100.0       87.02     90.16            99.81
5      90.31     83.85  91.76  96.69  98.21       86.66     91.97            98.97
6      95.78     91.23  94.41  99.15  99.40       97.49     96.13            98.51
7      80.00     82.86  72.14  93.60  13.04       59.69     84.75            98.08
8      97.41     93.97  98.74  99.91  100.0       64.89     59.64            99.91
9      35.00     34.00  26.00  63.33  22.27       60.46     86.17            99.44
10     66.32     64.18  60.91  82.15  99.68       98.77     99.38            100.0
11     70.77     74.96  76.45  92.76  98.55       75.32     84.75            100.0
12     64.42     41.72  67.45  91.35  98.42       71.82     77.58            99.76
13     95.41     93.46  96.00  99.13  98.97       91.11     95.56            97.98
14     92.66     89.45  93.79  98.22  100.0       79.49     84.62            99.80
15     60.88     47.77  65.54  87.84  96.49       90.91     90.91            99.59
16     87.53     88.17  83.66  94.29  95.29       100.0     100.0            99.86
OA     77.44     72.58  76.67  80.97  97.93       80.52     88.63            98.37
AA     74.94     70.19  72.88  80.94  88.41       78.65     85.26            94.63
K      74.32     68.58  73.27  78.25  97.64       63.72     73.66            98.10
Fig. 5. Classification maps on the IP dataset (panels: ELN-RegMLR, RNN-LSTM, RNN-GRU-PRetanh, SSJFN)


Table IV
Classification Accuracy (%) for the University of Pavia Image
Using Training Samples and Testing Samples


Class  Original  PCA    LDA    CNN    ELN-RegMLR  RNN-LSTM  RNN-GRU-PRetanh  SSJFN
1      87.25     87.07  82.91  96.72  96.09       77.45     84.45            99.89
2      89.10     88.38  80.68  96.31  89.32       61.83     85.24            93.01
3      81.99     81.96  69.21  97.15  99.65       64.60     54.31            99.66
4      95.65     95.14  95.99  96.16  94.69       97.98     95.17            99.19
5      99.76     99.76  99.90  99.81  99.32       99.18     99.93            98.79
6      88.78     88.06  89.53  94.87  99.96       91.19     80.99            98.59
7      85.92     85.32  81.11  97.44  100.0       90.90     88.35            99.80
8      86.14     86.06  85.81  98.23  98.96       92.29     88.62            99.93
9      99.92     99.92  99.92  98.04  99.02       97.47     99.89            99.14
OA     89.12     88.63  76.67  96.55  97.93       77.99     88.85            98.91
AA     90.50     90.18  72.88  97.19  88.41       85.88     86.33            98.63
K      85.81     85.18  73.27  95.30  97.64       70.28     80.48            97.91
Similarly, Table IV shows the OA and AA on the PUS dataset for multiple
training models. In this case, LDA shows the lowest accuracy, with an OA
of 76.67%, followed closely by RNN-LSTM, which gives an OA of 77.99%.
However, the AA of RNN-LSTM is much better at 85.88%. As on the IP
dataset, CNN, ELN-RegMLR, and SSJFN outperform all others, showing
accuracies of 96.55%, 97.93%, and 98.91%, respectively. Moreover, AA, as
well as K, follows the same trend.


IV. Conclusion 


An HSI classification strategy based on an LSTM network was discussed in
this study. The extraction of spectral and spatial features was treated
as a sequence learning problem, and LSTM was used to tackle the
associated challenges of gradients, long-term dependencies, information
extraction, and classification performance. To learn the spectral
properties of a specific pixel in the HSI, its spectral values in
multiple channels were fed into LSTM one by one. For spatial feature
extraction, a local image patch centered on the pixel was first
extracted from the first principal component of the HSI. The rows of the
image patch were then used as the input to LSTM. The researchers
compared the proposed method with state-of-the-art methods such as CNN
by carrying out experiments on the three HSI obtained by different
instruments (AVIRIS and ROSIS). Compared with using spectral information
alone, the experimental results showed that combining spectral and
spatial information improves classification performance and produces
more homogeneous regions in the classification maps.